7 - Artificial Intelligence II [ID:47298]

Okay, this better?

Good.

So, today we're starting to talk about decision theory.

Okay?

Remember these utility-based agents; that's what we're striving towards.

And we've worked hard over the last weeks to understand how to build world models and

judge world models in the presence of partial observability, possibly non-static worlds, and

non-deterministic actions.

Okay?

Basically, it's about managing a belief state, which is a set of possible states graded

by a likelihood function or probability function.
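A belief state like this can be sketched as a mapping from possible states to probabilities. The state names and numbers below are illustrative assumptions, not from the lecture; the update step is the standard Bayesian reweight-and-renormalize on new sensor evidence.

```python
# A belief state: possible world states mapped to probabilities.
# (State names and numbers are illustrative, not from the lecture.)
belief = {"door_open": 0.7, "door_closed": 0.3}

# Probabilities in a belief state should sum to 1.
assert abs(sum(belief.values()) - 1.0) < 1e-9

# After new sensor evidence, reweight by the sensor likelihood
# P(observation | state) and renormalize (a Bayesian update).
likelihood = {"door_open": 0.9, "door_closed": 0.2}
unnormalized = {s: belief[s] * likelihood[s] for s in belief}
z = sum(unnormalized.values())
belief = {s: p / z for s, p in unnormalized.items()}
```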

And that's what we looked at.

And essentially, the world model in the upper part of this utility-based agent is all about

a Bayesian network that models the part of the world that we want to model in our agent.

The lower part asks, right, what do my actions do, and how happy will I be in the state

that is reached by the action?

That's what we call decision theory, the theory of making decisions under uncertainty.

And that's what we're going to look into.

And we're pursuing the definition of rationality we started off with, which is essentially that we

want to maximize the expected utility of an action.

What we did last semester was essentially exactly that, right?

We wanted to maximize the heuristic value of an action, that's what we did, namely going

to Zerind, or whatever else we had as an example, right?

And since we're in a fully observable, deterministic environment,

we can compute the state: we're in a state S, and we have some evidence about the world,

what comes in from our sensors.

And that evidence is complete, because we have a fully observable world, so we can

actually use the sensor model to compute the state we must be in, right?

And then we can just see how the world changes if we take an action,

and then we can just look up the utility of that.

Whether we have a utility or a heuristic value, all of those things are computable,

so this is deterministic: the expected utility of an action A in a state is a thing we

can just compute.

And therefore we can just have a list of the utilities of our actions, and then we'll

just take the best one, okay?

And then we can do it in a greedy way or in an A* way and so on, just to see what is

best.
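The fully observable, deterministic case described above can be sketched in a few lines: a computable transition model, a utility (or heuristic) table, and a greedy pick of the best action. The toy states, actions, and utilities here are illustrative assumptions, not lecture data.

```python
# Sketch of action selection in a fully observable, deterministic world.
# (Toy transition and utility tables are illustrative assumptions.)
def result(state, action):
    # Deterministic transition model: the successor state is computable.
    transitions = {("A", "go_B"): "B", ("A", "go_C"): "C"}
    return transitions[(state, action)]

# Utility (or heuristic value) of each reachable state.
utility = {"B": 5.0, "C": 8.0}

def best_action(state, actions):
    # Everything is computable, so just take the action whose
    # result state has the highest utility.
    return max(actions, key=lambda a: utility[result(state, a)])

print(best_action("A", ["go_B", "go_C"]))  # prints "go_C"
```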

Now, all of the things we're using here, we don't have this semester, right?

We do not have a fully observable world anymore, we do not

have deterministic actions, and we might even have faulty sensors.

And we might have time in which the world changes without us changing it, okay?

So we can no longer read off the next state from a sensor reading, right?

We have faulty sensors, and they're certainly not total, so we do not know what the current

state is, and we do not know what the result state of an action is.

So what we can do instead is we basically make the result state of an action into a random

variable, right?

It is the outcome of a process that we do not fully understand, but we might

be able to reason about it probabilistically, right?

We might know the probability that the result of an action is S prime, given the

partial evidence we get from our sensors and the fact that we're doing the action.
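Treating the result state as a random variable, the expected utility of an action weights each possible outcome's utility by its probability P(Result(a) = s' | e), and the rational agent picks the action maximizing that sum. A minimal sketch, with toy outcome probabilities and utilities as illustrative assumptions:

```python
# Expected utility under uncertainty: the result of an action is a random
# variable, so weight each possible outcome's utility by its probability
# P(Result(a) = s' | e).  (Toy numbers are illustrative assumptions.)
def expected_utility(action, outcome_probs, utility):
    # outcome_probs[action] maps each possible result state s' to
    # P(Result(action) = s' | evidence).
    return sum(p * utility[s] for s, p in outcome_probs[action].items())

outcome_probs = {
    "go_left":  {"goal": 0.8, "pit": 0.2},
    "go_right": {"goal": 0.5, "safe": 0.5},
}
utility = {"goal": 10.0, "pit": -20.0, "safe": 0.0}

# Pick the action maximizing expected utility (the MEU principle).
best = max(outcome_probs,
           key=lambda a: expected_utility(a, outcome_probs, utility))
print(best)  # prints "go_right": 0.5*10 beats 0.8*10 - 0.2*20
```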

Part of a video series

Accessible via: open access

Duration: 01:32:50 min

Recording date: 2023-05-09

Uploaded on: 2023-05-10 14:09:32

Language: en-US
